The mouse chondroadherin gene was isolated from a cosmid genomic
library by the use of a rat chondroadherin cDNA probe. Southern blot
analysis of mouse genomic DNA revealed a simple pattern of
hybridization, indicating a single-copy gene for chondroadherin. The
mouse chondroadherin gene encompasses 4.1 kb and consists of four
exons separated by one large intron of 1929 bp followed by two smaller
introns of 247 and 225 bp, respectively. Most of the translated region,
including the start codon and the main part of a leucine-rich region,
is contained within the first exon. Two small exons of 164 and 146 bp
encode the rest of the protein. Interestingly, a third intron is
located in the 3'-UTR, 4 bases from the stop codon. A putative promoter
region of 669 bp was sequenced and shown to contain a potential
TATAA-box signal 29 bp upstream of the transcription start site and
several recognition sites for transcription factors. The exon/intron
organization of the chondroadherin gene differs from those of the other
known genes of the leucine-rich repeat (LRR) family in the
extracellular matrix. Taken together with comparison of protein
sequences of other members of the LRR family in the extracellular
matrix, the data suggest that chondroadherin has evolved along a
different pathway. The chondroadherin gene was mapped to mouse
chromosome 11, near D11Mit14, by single-strand conformation
polymorphism linkage analysis. © 1998 Academic Press



INTRODUCTION 

Sequence data from this article have been deposited with the Gen-

Cartilage extracellular matrix is dominated by the
presence of collagen fibers and the large aggregating
proteoglycan, aggrecan. Collagen and aggrecan together
are the main molecules responsible for the mechanical 
properties of cartilage. However, cartilage extracellular 
matrix also contains a variety of other extracellular ma- 
trix proteins. Several of these proteins appear to have 
roles in the regulation of the assembly of the matrix, 
e.g., of collagens to fibers, such as decorin (Vogel et al.,
1984) or fibromodulin (Hedbom and Heinegard, 1989),
and of proteoglycans to aggregates, such as the link
protein (Heinegard and Oldberg, 1989). Another function of
extracellular matrix proteins is to mediate binding of
cells. Chondroadherin is a cell-binding protein that has
been shown to bind chondrocytes and fibroblasts with
an affinity similar to that of fibronectin or collagen
(Sommarin et al., 1989). This binding is mediated by an
integrin, α2β1 (Camper et al., 1997). Chondroadherin could
therefore have important roles in signaling information on
matrix properties and function to the cell.

The complete cDNA sequence for bovine chondroadherin
has been deduced (Neame et al., 1994), showing
that chondroadherin belongs to the family of relatively
small, leucine-rich proteins (Patthy, 1987) that are
present in the extracellular matrix of cartilage. This
family includes biglycan (Fisher et al., 1989), decorin
(Day et al., 1986), fibromodulin (Oldberg et al., 1989),
lumican (Blochberger et al., 1992), keratocan (Corpuz
et al., 1996), PRELP (Bengtsson et al., 1995),
proteoglycan-Lb (Shinomura and Kimata, 1992), and
osteoinductive factor (Madisen et al., 1990). These proteins
are similar in their overall primary structure and have a
central region with some 5-11 leucine-rich repeats
(LRR) flanked by two disulfide loop regions.

Most of the LRR proteins in the extracellular matrix
characterized to date are proteoglycans with one or a
few glycosaminoglycan (GAG) chains. Exceptions are
PRELP and chondroadherin, which are not substituted
with GAGs (Bengtsson et al., 1995; Neame et al., 1994).
Both decorin and biglycan can, however, occur without
GAG substitution (Johnstone et al., 1993; Roughley et
al., 1993). All the LRR proteins, with the exception of
chondroadherin, have two or more sites for N-linked
oligosaccharide substitution. In some cases these
substituents may actually become extended to keratan
sulfate chains. Chondroadherin appears to be devoid of
carbohydrate substitution except for a small, as yet
uncharacterized oligosaccharide positioned on Ser 123
(Neame et al., 1994).

We have isolated the mouse chondroadherin gene to 
elucidate its genomic organization and to determine 
similarity with the other known genes of the LRR fam- 
ily. Furthermore, we have determined the chromo- 
somal localization of the gene. The results show that 
the genomic organization and chromosomal localiza- 
tion differ from the other members of the family. This 
further strengthens the view that chondroadherin is of 
a different developmental origin. 

MATERIALS AND METHODS 

Southern blot analysis. Mouse genomic DNA was isolated from
liver tissue as described (Ausubel et al., 1994) and initially
characterized by Southern blot hybridization. Ten micrograms of DNA was
digested with 2 × 10 units of the restriction enzymes BglII, HindIII,
and KpnI (Life Technology) for 2 × 30 min at 37°C. Fragments were
separated on a 0.8% agarose gel, transferred onto positively charged
nylon membrane (Hybond N+; Amersham), and hybridized according
to Church and Gilbert (1984). The hybridization was performed using
an 887-bp NcoI-ScaI fragment from a full-length rat cDNA clone
(Shen et al., 1998) (GenBank Accession No. AF004953) as the probe
(nucleotides +62 to +949 in the complete cDNA sequence of rat
chondroadherin). Radiolabel was detected by exposure of X-ray film at
-70°C in cassettes with intensifying screens.

Isolation of a genomic clone. A mouse genomic cosmid library was
screened using the 887-bp NcoI-ScaI rat cDNA fragment as the
probe. The probe included the ATG start codon. After two rounds of
screening one positive clone was selected. The DNA corresponding
to the chondroadherin gene was held within a cosmid referred to as
sCos-1 (Evans et al., 1989).

Characterization of the cosmid clone. The cosmid clone was
analyzed by digestion of 2 µg of cosmid DNA with the restriction enzymes
BamHI, SacI, HindIII, and EcoRI (Life Technology). The fragments
were separated and hybridized as above using rat cDNA probes for
the 5' end (EcoRI-StuI, 338 bp), the 3' end (NcoI-EcoRI, 285 bp),
and an EcoRI-digested full-length rat cDNA clone (1673 bp).

DNA sequencing. Genomic fragments from the cosmid clone were 
subcloned into pBluescript KS II. The fragments of 2500 and 4000 
bp, respectively, were sequenced by the ABI Prism Dye Terminator 
cycle sequencing kit (Applied Biosystems, Perkin-Elmer), first using
T3 and T7 primers and internal oligonucleotide primers (18- to 21- 
mers) synthesized from the rat cDNA sequence and later from the 
new mouse DNA sequence. 

The sequencing reaction products were analyzed on an automated 
sequencer apparatus Model 373A (Applied Biosystems, Perkin-Elmer).
Exons were sequenced in both directions. Introns were sequenced
in one direction only, except for regions with poor reliability,
of which both strands were sequenced. Analyses of sequences were 
performed using the PC Gene (Intelligenetics) program package. 

Single-strand conformation PCR analysis. Primers were designed
to amplify a region corresponding to intronic sequences of
chondroadherin to test for single-strand conformation polymorphisms
(SSCPs) between mouse strains. These were analyzed as
previously described (Beier, 1993). Briefly, oligonucleotides were
radiolabeled with [32P]ATP using polynucleotide kinase, and genomic
DNAs from a series of mouse strains were amplified using standard
protocols (anneal at 55°C for 1 min, extend at 72°C for 2 min, and
denature at 94°C for 1 min for 40 cycles, with a final extension at
72°C). Two microliters of the amplified reaction mixture was then
added to 8.5 µl USB (United States Biochemical Corp.) stop solution,
denatured at 94°C for 5 min, and immediately placed on ice. Two
microliters of each reaction mixture was then loaded onto a 6%
nondenaturing acrylamide sequencing gel and electrophoresed in 0.5×
TBE buffer for 2-3 h at 40 W in a 4°C cold room. A primer pair with
the sequence ACGAAGGCTGATTTAGAATGAGG (forward) and
GTATTGGTGCCCTCCTCTGAG (reverse) identified a polymorphism
between C57BL/6J and Mus spretus and was used to analyze
DNA prepared from the BSS backcross (Rowe et al., 1994). The
selected primer pair amplified nucleotides 2606 to 2850 in intron 1.
The identity of the amplified fragment was confirmed by cleavage
with AccI restriction enzyme. The allele distribution pattern was
analyzed using the Map Manager program (Manly, 1993).

Restriction mapping. Restriction mapping of the cosmid clone 
was done essentially as described by Evans et al. (1989). 

Primer extension. A 21-base oligonucleotide,
5'-AGACCAGACTGAATAAGAGCG, corresponding to the reverse complement from
position +60 to +80 in the first exon of the mouse sequence, was end
labeled with [γ-32P]dATP. Forty micrograms of total RNA isolated
from mouse trachea was mixed with 10^6 cpm of labeled oligonucleotide
in 0.15 M KCl, 0.01 M Tris-HCl, pH 8.3, 1 mM EDTA, at 65°C for
90 min. Reverse transcription was performed using Superscript II
(Gibco BRL) according to Current Protocols in Molecular Biology.
After RNase digestion the product was phenol extracted, ethanol
precipitated, and dissolved in loading buffer. The sample was
separated on a 6% polyacrylamide gel. A dideoxy sequencing reaction of
genomic cosmid mouse DNA primed with the same oligonucleotide
was used as the standard.

Reverse transcriptase PCR. Three micrograms of total RNA
isolated from mouse trachea was used to synthesize cDNA using 2 pmol
of specific primer, 5'-CAGCGCTGTGCATCCGCA, corresponding to
the reverse complement from position 3621 to 3638 in the fourth
exon; 200 ng oligo(dT); and Superscript II (Gibco BRL) under
conditions recommended by the manufacturer. A small aliquot, 2 µl, of this
reaction was used for PCR amplification using Taq DNA polymerase
(Gibco BRL) and the following primers flanking the introns: intron
1, 5'-ACCTGCGCTGGCTCTACCTGT and 5'-GGTCTCCAGGTTGTCAAA;
intron 2, 5'-CTCAGATGCTGCCTTCTC and 5'-CATCTGTGTCACGAATCC;
and intron 3, 5'-CTCGCCAGCCAAGTTCAA and
5'-GAGGCTGTAGGAGAAGGTGTG.
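
The stated primer coordinates can be checked against the sequence in Fig. 2 by reverse-complementing each primer and locating it on the sense strand. The following is a minimal sketch of that check, not the authors' procedure: the two short fragments were copied by hand from Fig. 2, their start coordinates (+60 and 3613) were read off the line-end numbering of the figure, and all function and variable names are hypothetical.

```python
# Sketch: locate a reverse-strand primer on the sense strand.
# Fragments and their start coordinates are read from Fig. 2 (assumption).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(primer: str, fragment: str, fragment_start: int):
    """Return the 1-based (start, end) of the primer's target site on the
    sense strand, or None if the target is not in the fragment."""
    target = reverse_complement(primer)
    idx = fragment.find(target)
    if idx < 0:
        return None
    start = fragment_start + idx
    return start, start + len(target) - 1


# Exon 1, nucleotides +60 to +91 as printed in Fig. 2
EXON1_FRAGMENT = "CGCTCTTATTCAGTCTGGTCTTTCTTGCCATC"
# Exon 4, nucleotides 3613 to 3647 as printed in Fig. 2
EXON4_FRAGMENT = "ACAGCCTCTGCGGATGCACAGCGCTGCCCCGCCCC"

PRIMER_EXTENSION_OLIGO = "AGACCAGACTGAATAAGAGCG"  # primer extension
RT_PRIMER = "CAGCGCTGTGCATCCGCA"                  # cDNA synthesis

primer_extension_site = locate(PRIMER_EXTENSION_OLIGO, EXON1_FRAGMENT, 60)
rt_primer_site = locate(RT_PRIMER, EXON4_FRAGMENT, 3613)
print(primer_extension_site, rt_primer_site)
```

Under these assumptions both primers land on the coordinates given in the text (+60 to +80 and 3621 to 3638).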

To obtain information on intron size, PCR amplification of genomic 
DNA isolated from mouse liver was done using standard methods 
as described (Current Protocols) and the same primers as above. The
resulting PCR products were analyzed on a 0.8% agarose gel. 

RESULTS 

Analysis of mouse genomic DNA. Southern blot 
analysis was performed on mouse genomic DNA. The 
blot was hybridized with a rat chondroadherin cDNA 
fragment. A simple pattern of hybridization consistent 
with the presence of a single-copy gene for chondroadherin
was seen (Fig. 1).

Analysis of the genomic clone. A mouse genomic library
was screened using the same rat probe as above.
The screening of the library yielded one clone containing
the whole chondroadherin gene. The DNA of
the genomic clone was digested to completion with the
restriction enzymes BamHI, SacI, EcoRI, and HindIII.
The size of the entire insert in the cosmid clone was
estimated at 31 kb by sizing the fragments on an agarose
gel. A crude map of the coding regions was constructed
using rat cDNA probes in Southern hybridization
analyses. This experiment revealed two SacI fragments
of about 2.5 and 4 kb encoding the 5' and 3'
regions of the cDNA. Southern hybridization using a
full-length rat cDNA clone gave the same hybridization
pattern (not shown), indicating that the mouse
chondroadherin gene was completely included in this 6.5 kb
of genomic DNA.

FIG. 1. Detection of the chondroadherin gene in mouse genomic
DNA. Southern blot analysis was performed with 10 µg each of mouse
genomic DNA digested with restriction enzymes as indicated, probing
with an 887-bp rat chondroadherin cDNA fragment. The DNA
was digested with the following enzymes: lane 1, BglII; lane 2,
HindIII; and lane 3, KpnI. A simple pattern of hybridization is seen,
suggesting a single-copy gene for chondroadherin.

Structure of the gene. The two SacI fragments of
the genomic clone were subcloned into pBluescript and
sequenced in full (Fig. 2). The results show
that the mouse chondroadherin gene consists of four
exons spanning 4.1 kb of genomic DNA, not including
the putative promoter region (shown schematically in
Fig. 3). The intron/exon organization divides the coding
sequence of chondroadherin into three exons. The first
exon contains the 5'-UTR, the ATG translational start,
and most of the coding region, including 9 of the 11
leucine-rich repeats. The second and third exons encode
the remainder of the coding region, and these exons
divide the protein sequence such that the 4
carboxy-terminal cysteines are located in two exons. The
fourth exon is entirely held within the 3'-UTR and
contains the putative polyadenylation signal. The first
intron is comparatively large, 1929 bp long, and splits
the gene in a location corresponding to the ninth
leucine-rich repeat. This first intron also contains a
twice-repeated, 48-nucleotide element at position 2316 to
2363 and at position 2370 to 2417. Only 1 nucleotide
differs between the two sequences. The ninth leucine-rich
repeat is spliced between residues 14 and 15,
indicating a phase 0 intron. The second intron, 247 bp long,
splices at the 23rd amino acid residue in the last repeat
after the second nucleotide, i.e., in phase 2. Sequence
analysis indicated the presence of a third intron located only
4 bases from the translation termination codon in the
3'-UTR. Alignment of rat or bovine cDNA with the
mouse gene sequence showed high similarity in the
3'-UTR except for a potential 225-bp insertion that
has no similarity to the rat or bovine cDNA sequence
(not shown). Furthermore, the ends of this potential
intron have the GT/AG sequence typical for introns. To
confirm that all potential introns were spliced out in
mature mouse mRNA, RT-PCR was performed using
mouse tracheal RNA. The cDNA was used to amplify
fragments using primers flanking the introns. The
sizes of the products were compared to the sizes of
products from PCR amplification using the same primers
but with genomic DNA as a template. This analysis
confirmed that all the putative introns were spliced out
in the fragments amplified from the cDNA (Fig. 4).
Additionally, all introns show the classical GT/AG
sequence flanking the intron splice junctions at the 5'
and 3' ends of the introns (Breathnach and Chambon,
1981) (Fig. 2).
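
The exon and intron sizes quoted above can be cross-checked with simple arithmetic on the boundaries shown in Fig. 2. The sketch below is an illustration only: the exon coordinates are a hand reading of the line-end numbering and the uppercase/lowercase transitions in Fig. 2, the end of exon 4 is taken as the last nucleotide shown (4072), and the names are hypothetical.

```python
# Sketch: derive exon/intron lengths and intron phase from exon coordinates.
# Coordinates (1-based, inclusive) are read off Fig. 2 and are an assumption;
# exon 4 is assumed to end at the last nucleotide shown (4072).

EXONS = [(1, 820), (2750, 2913), (3161, 3306), (3532, 4072)]


def lengths(segments):
    """Length of each (start, end) segment, both ends inclusive."""
    return [end - start + 1 for start, end in segments]


def introns_between(exons):
    """Intron coordinates implied by consecutive exons."""
    return [(a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


exon_lengths = lengths(EXONS)
intron_lengths = lengths(introns_between(EXONS))

# Intron 1 is stated to be phase 0, so exon 2 starts on a codon boundary.
# The phase of intron 2 is then the remainder of exon 2 length modulo 3.
PHASE_INTRON_1 = 0
phase_intron_2 = (PHASE_INTRON_1 + exon_lengths[1]) % 3

# Exon 3 = bases completing the split codon + whole codons (including the
# stop codon) + 4 bases of 3'-UTR before intron 3.
UTR_BASES_IN_EXON_3 = 4
whole_codon_bases = exon_lengths[2] - (3 - phase_intron_2) - UTR_BASES_IN_EXON_3

print(exon_lengths, intron_lengths, phase_intron_2, whole_codon_bases)
```

With these coordinates the introns come out at 1929, 247, and 225 bp and the two small exons at 164 and 146 bp, as in the text. The summed exon length is close to the 1.6-kb mRNA size reported below.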

The mouse nucleotide sequence of the translated
regions shows a homology of 94% compared with the
cDNA sequence of rat chondroadherin (Shen et al.,
1998), and of 86% with bovine chondroadherin (Neame
et al., 1994).

Polyadenylation signal. Sequence determination of
2665 bp of genomic sequence downstream of the TAA
stop codon to the end of the second SacI fragment did
not reveal a classical AATAAA polyadenylation signal.
The best candidate for a polyadenylation signal is the
repeat TATAAACATAAA found at position 4047,
underlined in Fig. 2. This sequence is also found at the
3' end of the 1644-base-long, full-length rat cDNA
clone just upstream of a poly(A) stretch (Shen et al.,
1998). This putative polyadenylation signal gives a
total length of the mature mRNA of 1.6 kb. This agrees
well with the size of mRNA found on Northern blots of
bovine (Neame et al., 1994), rat, and mouse
chondroadherin mRNA (data not shown).
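
The search for a polyadenylation signal can be illustrated on the last 50 nucleotides of the mouse sequence (positions 4023 to 4072 as printed in Fig. 2 and Fig. 6). This is a sketch under stated assumptions, not the analysis used in the paper (which relied on the PC Gene package): it scans for hexamers within one mismatch of the classical AATAAA, and the helper names are hypothetical.

```python
# Sketch: scan a 3' region for the classical AATAAA hexamer and for
# single-mismatch variants. The fragment is the last 50 nt of the mouse
# sequence as printed in Fig. 2 (positions 4023-4072, an assumption).

MOUSE_3PRIME = "CTTCCTGTTTGCGCTCCAGATTTCTATAAACATAAATGTATGTGTGTTCA"
FRAGMENT_START = 4023
CLASSICAL_SIGNAL = "AATAAA"


def near_matches(seq: str, consensus: str, max_mismatches: int = 1):
    """Return (offset, window, mismatches) for every window of seq that
    differs from consensus at no more than max_mismatches positions."""
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        mismatches = sum(a != b for a, b in zip(window, consensus))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


hits = near_matches(MOUSE_3PRIME, CLASSICAL_SIGNAL)
for offset, window, mismatches in hits:
    print(FRAGMENT_START + offset, window, mismatches)
```

In this fragment there is no exact AATAAA, and the two one-mismatch hexamers (TATAAA and CATAAA) together make up the TATAAACATAAA repeat beginning at position 4047.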

Transcription start site. A primer extension analy- 
sis was done with a primer complementary to nucleo- 
tides +60 to +80 just downstream of the ATG transla- 
tional start codon. The analysis places the transcrip- 



FIG. 2. DNA sequence and deduced amino acid sequence of the mouse chondroadherin gene. The sequence shown begins with the 
putative promoter region and ends with the probable 3' end of the mRNA. A total of 669 bp of genomic DNA upstream of the ATG was 
sequenced and analyzed with the PC Gene (Intelligenetics) program package and the GCG program package. A TATAA-box signal (double 
underline) was found at -29 bp upstream of the putative transcription start site, identified by primer extension analysis (numbered +1). 
Several recognition sites for transcription factors were found as indicated (bold, italics) (GATA, globin-activating site). The sequences of 
exons 1-4 are written in uppercase letters. Intron 1 is truncated as shown with a double-slash. Intron sequences including the gt/ag 
sequences at the boundaries are written in lowercase letters. Signal peptide sequence is in italic, cysteine residues are double underlined, 
and the position of Ser 123, the only site at which a posttranslational modification has been found, is indicated by a filled triangle. The
positions of the LRRs are indicated. The atypical putative polyadenylation signal is single underlined. 
-669 bp

GAGCTCTTACGGGCCTGGTGCCACTGGGCTCCGAGAAGGGGCAGAGCCAAACGCACGGCTGTCAGCTAGCCT - 5 9 8 

CTGCAACCAGCTCCCCCACCTCCTTGGGATAAACTGAGGAACCCAGAAGCGGGAGCCCAACCCACAGCAGCT - 5 2 6 

CTCACGCTCCGCCTGCGCCGCACAACAGTCCCATTAAAGCGCCGCCGGCTGGCCGACCGCGGTGAGACGCAT -454 

CCGGCTGTCGGGCCCCACTTCCTCCCTCCCGGAGTCCAGGGTGACCTGTCTGCCAAGGGTGTATGGGGGAAG - 3 8 2 

C-Ets 

GAGACGTAGAGAACTCAAACTTGAGCAAATAAATAAGTTCTGGGAACACTTCCCTCTGCCCWOTaaAAATrc -310 
GATA-1 

AGAAGCCCCTCGACACACCTArCACCGTCCACCCCACCTCGGGGTGTTGGTCCAGATAGAGGAGGGTAGGGG -238 

C-EtS 

AAGGTGCAGCATAATGTTTGCAAACAGGflACCAAGGGGTTGGGGTTCAGGGGAAGGGCCCTCAGCCCT ACAC -166 

GATA-1 Spl 

ACGGT CTC CTGCTGTGAAAAGAGGCCCCCAGCC ATC GAQGATQGOaPiAGC ATCTCTOOOCOC3GAAGGGTTAA - 9 4 

-29 bp 

ATCAGTGGCTTCGGTGCTCCACGTAGTAGCTGGCTCCGCTGCCAACTGCGGTCAAGGCTGCC CTATAAATGG -22 

GCCGG GAGACCCGAGAGTCGA ~mm*=m _ x 

GGAC^TGTCGCTGCCTTAGCCCCCAGCCCAGGCTCAAGGCGTTCTAACCATGGCCCGCGCGCTCTTATTCAGT 73 

MARALLFS -13 

CTGGTCTTTCTTGCCATCCTCCTGCCTGCGCTAGCCGCCTGCCCCCAAAACTGCCACTGCCATGGAGATCTG 145 

LVFL,AILLPALAhCPQHCHCHGT)L 12 

CAGCATGTCATCTGCGACAAGGTGGGGCTGCAGAAGATCCCCAAGGTATCAGAGACAACCAAACTGCTCAAT 217 

QHVICDKVGLQKI PKVSEITTKLLN 36 
= l lrr 1 

CTC C AGC G C AAC AACTT CCCGGTGCTGGCTGCC AACTCGTTT CGGACC ATGC CGAACCTGGTCTCCCTGC AC 289 

L Q RNNF PVLAAN S FRTM PINLVS L H 60 

L LRR 2 

CTGCAACACTGCAACATCCGCGAGGTGGCGGCTGGTGCCTTCCGAGGCCTGAAGCAGCTTATCTACCTGTAC 361 

LQHCNI REVAAGAFRGLKIQLIYLY 84 
= L LRR 3 

CTGTCCCACAACGACATCCGGGTATTGCGAGCTGGAGCCTTCGACGACCTGACTGAACTCACTTACCTCTAT 433 

LSHNDI RVLRAGAFDDLTIELTYLY 108 

L LRR 4 

C T AGACC AC AAC AAAGTGTCGG AACTGCCCCGGGGGTTGCTCTCrrc CT C TGGTC AACCTCTTC ATCTTGC AA 505 

L DHNKVS E LPRGL LSPLVINLF I L Q 132 

A i-LRR 5 

CTCAACAACAACAAAATCCGAGAGCTGCGTGCTGGAGCTTTCCAGGGGGCCAAGGACCTGCGCTGGCTCTAC 577 

LNNNKI RE LRAGAFQGAKIDLRWLY 156 

L LRK 6 

CTGTCAGAAAATGCCCTCAGTTCCCTGCAGCCTGGTTCCCTGGATGATGTGGAGAACCTAGCCAAGTTCCAC 649 

L S ENAXi S S LQPGS LDDVEINLAKF H 180 

L LRR 7 

CTGGACAAGAACCAGCTGTCTAGCTACCCCTCAGCCGCCCTGAGCAAACTTCGGGTGGTGGAGGAGCTGAAG 721 

LDK NQLSSYPSAALSKLRIVVEELK 204 

Llrr 8 

CTGTCTCACAACCCTCTGAAGAGCATCCCAGACAATGCCTTCCAGTCCTTCGGTAGATATCTGGAGACCCTC 793 

LSHNPLKSIPDNAFQSFGRIYLETL 228 

'-LRR 9 

TGG CTGGAT AAC ACC AACCTGGAG AAGg t a ag t g c c c c a g c t gcag 1 1 / /ctgcctccctcacctcacag 2749 

WLDNTNLEK 237 



TTCTCAGATGCTGCCTTCTCGGGTGTGACCACACTGAAACACGTCCATCTGGACAACAACCGCCTGAACCAA 2821 
F S DAAF S G VTIT LKHVHL DNNRLNQ 261 

L LRR 10 

CTGCCTTCCTCCTTCCCCTTTGACAACCTGGAGACCCTCACTCTCACCAACAACCCATGGAAATGCACCTGC 2893 

L P S S F P F DINLETLTLTNN PWKCT C 285 

L LRR 11 = = 

CAGCTCCGTGGCCTTCGGCGgtgagaatattcctccatataacccccagactgccgtccacatgacagacgg 2965 

Q L R G L R R 292 

tcctagagtaggacagcctggacatcctagtcagctacctagcatgtcgggtactgagtggttcccttctct 3037 

catttgtcaaatgaagatgacaactccagatatttctatggccatagtccatcccggtcactgtccctttcc 3109 

caagccttcccacccagcttttccaagcccagcaactctttgtctctgtagGTGGTTGGAAGCCAAGGCTT 3180 

W L E A K A 298 

CTCGACCGGATGCTACCTGCTCCTCGCCAGCCAAGTTCAAGGGTCAGCGGATTCGTGACACAGATGCCCTTC 3252 

SRPDATCSSPAKFKGQRIRDTDAL 322 

GCAGCTGCAAATCCCCGACCAAGAGGTCCAAGAAAGCTGGCCGCCATTAAACAGgtgggggctgggtaggga 3324 
RSCKSPTKRSKKAGRH- 338 

ggccaccacggtctacctttggaaattccagatggggtgctgctatatcccatgacaccacttccggaggag 3396 

caatcagttccctgtcttacaagaaaaggagggaggacaggataacctctcccatggcttggcctaggacgt 3468 

ccatgggtccctttaatgactctgggtgactggaatcctaatacccatcttctctcactatagGTCCTGATC 3540 

C AGCC AGTCCTGGCGACTGCCTTCCGCTGGAGAGACTACTGACGTTCCCTCCCATCATCCACACCTTCTCCT 3 6 12 

ACAGCCTCTGCGGATGCACAGCGCTGCCCCGCCCCCGCCCCCACCTAGGTACATCCTGGCAGGGGCACTGGG 3684 

CTCTCTATCACCATCCCAGCTCCACCCAGTGGGGTCCTAGGAAAGACACAGAATCCCTCCCCAGCCACTGTG 3756 

TCTGGGCTCTGCCATGGCTCCTTTGAGAGAAGCTATTGTAGAACCTCCTACCCTCTGTCCATCGGAGCTAAA 3828 

GCGCAGTGGTCATTGGGATGACCACGTTATTACCACCTTCCTCGGTTCCCTCTGTCCCTGCCATTTGGAAAC 3900 

AAAC ATC AGGCCCCTGACCCACCCTGATTGCCAGAAAGAATTTCAGGCCC ATGC CCCAACTCTGCCAGTTCC 3972 

TGCCTGCCAGGACATGCTACCAGGATACCAGTAGCGCTTGGCTGCATATCCTTCCTGTTTGCGCTCCAGATT 4044 

TC TATAAACATAAAT GTATGTGTGTTCA 4072 
FIG. 3. Organization of the mouse chondroadherin gene. A restriction map of the cosmid clone is shown at the top. The structures of
the two SacI fragments that were sequenced in full are shown schematically. The gene spans about 4.1 kb of genomic DNA. The boxes
show the locations of the exons. Approximate locations of the putative TATAA-box signal and the ATG translational start site are indicated.
Exon 1 encompasses the main part of the translated region. It is separated from the next exon by a 1929-bp intron. This is followed by two
smaller exons interrupted by a 247-bp intron. The stop signal is found in the third exon as indicated. The third intron is located in the
3'-UTR. A putative polyadenylation signal is found in the fourth exon.



tional start site, beginning with GGAC, 50 bp upstream 
of the ATG (Fig. 5). This site coordinates with the start 
site of the cDNA for rat and bovine chondroadherin, 
which are both very similar to the mouse sequence. 
The rat and bovine full-length cDNAs both start with 
the sequence GGAC as in the mouse genomic DNA 
sequence. 

Protein sequence. The protein-coding sequence of 
mouse chondroadherin corresponds to 358 amino acids, 
including the signal peptide of 20 amino acids. Align- 
ment of the amino acid sequence of mouse chondroad- 
herin with rat and bovine chondroadherin shows simi- 
larities of 97 and 92%, respectively. The leucines, cys- 
teines, and other hydrophobic residues in the LRR core 
region are all conserved in the three species. The only 
position with a posttranslational substitution, i.e., 
Ser 123, is conserved among all species.

Putative promoter region. A total of 669 bp of geno- 
mic DNA upstream of the transcriptional start site was 
sequenced and analyzed for the presence of sequences 
typical of mammalian promoters. A TATAA-box se- 
quence was found 29 bp upstream of the transcriptional 
start site (Fig. 2). This distance from the transcrip- 
tional start site agrees well with that of other known 
TATAA boxes. The analysis did not reveal any consen- 
sus sequence for a CCAAT box in this region. Other 
possible regions upstream of this region all showed 
lower scores in the searches. Upstream of the TATAA- 
box several potential recognition sites for transcription 
factors were found as indicated in Fig. 2. 

Chromosome localization. SSCP was used to map
chondroadherin (Beier, 1993; Beier et al., 1992). Primers
corresponding to intronic sequence of chondroadherin
were analyzed and identified an SSCP between
inbred mouse strains (see Materials and Methods). The
BSS interspecific backcross was genotyped and the
allele distribution pattern analyzed using the Map
Manager program. Chondroadherin mapped to chromosome
11 with a lod likelihood score of 24.7. Three genotypes
generated double-crossovers with adjacent loci and
were not included in the analysis since they are likely
to be incorrectly typed. No recombinants were found
between chondroadherin and a cluster of genes that
includes Hoxb7, Hoxb13, CoxaS, Nrdf, and Csfg, as well
as the microsatellite marker D11Mit14. The position of
chondroadherin with respect to flanking microsatellite
markers is D11Mit36 - 4.4 ± 2.2 cM - chondroadherin -
3.3 ± 1.9 cM - D11Mit10.
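
Map distances of the form "x ± y cM" from a backcross are commonly obtained from the recombination fraction and its binomial standard error. The sketch below shows that calculation in general terms. Whether Map Manager used exactly this estimator for the values above is an assumption, and the counts in the example are hypothetical because the number of informative animals is not given in the text.

```python
# Sketch: recombination fraction and binomial standard error, in cM.
# The estimator is a common textbook choice (assumption); the example
# counts are hypothetical and are not taken from the paper.
import math


def map_distance_cm(recombinants: int, animals: int):
    """Return (distance, standard_error) in centimorgans for a backcross
    with `recombinants` recombinant animals out of `animals` typed."""
    if animals <= 0 or not 0 <= recombinants <= animals:
        raise ValueError("need 0 <= recombinants <= animals and animals > 0")
    r = recombinants / animals
    standard_error = math.sqrt(r * (1.0 - r) / animals)
    return 100.0 * r, 100.0 * standard_error


# Hypothetical example: 5 recombinants among 100 typed animals.
distance, error = map_distance_cm(5, 100)
print(f"{distance:.1f} +/- {error:.1f} cM")
```

The standard error shrinks with the square root of the number of animals typed, which is why intervals of a few cM carry errors of about 2 cM in a cross of roughly a hundred animals.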

DISCUSSION 

We have isolated and sequenced the gene for mouse
chondroadherin, characterized the intron/exon organization,
and determined the transcriptional start site
and the chromosomal localization of the gene. Mouse
chondroadherin was shown to represent a single-copy
gene, and it is spread over about 4.1 kb of genomic
DNA. This makes the chondroadherin gene smaller
than the other known genes of the structurally related
extracellular matrix LRR proteins, which range from
7.5 kb for the human lumican gene to 38 kb for the
human decorin gene. About half the number of genes
encoding the different LRR proteins have been
characterized (Kobe and Deisenhofer, 1994). A comparison of
the primary structure of LRR proteins for which the
sequences are known shows that decorin and biglycan
form one class, whereas lumican, fibromodulin, and
PRELP form a second class and chondroadherin a third
class (Bengtsson et al., 1995). This relationship also
holds at the gene structure level. The decorin (Scholzen
et al., 1994) and biglycan (Wegrowski et al., 1995) genes
are encoded by eight exons with the LRRs in six exons.
The fibromodulin (Antonsson et al., 1993), lumican
(Grover et al., 1995), and PRELP (Grover et al., 1996)
genes are composed of three exons with the LRRs
encoded by a single exon. The chondroadherin gene differs
by having a total of four exons with the LRRs encoded
by two exons. The first intron splits the ninth LRR
between residues 14 and 15 in the putative α-helix in
a phase 0 manner. The second intron splices the 11th
repeat between nucleotides 2 and 3 in the 23rd residue,
indicating a phase 2 intron, also in the putative
α-helix. The third intron is located only 4 bases from the
termination codon. In the decorin and biglycan genes
the introns often localize at similar positions, most
frequently between residues 1 and 6 in the β-sheet region
of the LRR. This position is the most frequent intron
localization among LRR proteins (Kobe and Deisenhofer,
1994), with chondroadherin being an exception
to this general pattern.

FIG. 4. Verification of intron localization by PCR. The locations
of introns were verified by comparison of PCR on genomic DNA and
tracheal cartilage cDNA using primers flanking the sides of possible
introns. The sizes of the predicted fragments are shown at the top.

The consensus sequence of the LRR is
xLxxLxLxx[N,C,T]x[L,I]xxaP followed by 5 to 10 residues. Two
types of repeats are recognized that differ at position
10 in the consensus sequence. One is the A-type repeat
with a cysteine/threonine and the other the B-type with
an asparagine at this position (Kobe and Deisenhofer,
1994). The length of the LRRs varies within the family.
The LRRs tend to appear in triplets with two longer
repeats followed by a short repeat (Bengtsson et al.,
1995). Biglycan, decorin, fibromodulin, lumican, and
PRELP all have exclusively B-type repeats. These vary
in length from 20 to 26 amino acids. Again chondroadherin
is different, with the 2nd and 9th LRRs being
A-type repeats with a cysteine and a threonine,
respectively. In addition, chondroadherin shows a distinct
pattern with LRRs of equal length, being composed of
24 residues. Exceptions are the 8th repeat, which
contains 25 residues, and the 10th repeat, which appears
to contain only 22 residues.
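
The A-type/B-type distinction can be made concrete by matching the first 12 positions of the consensus against the start of a repeat and reading off position 10. The sketch below is illustrative only: the regular expression encodes just the xLxxLxLxx[N,C,T]x[L,I] part of the consensus, the three 12-residue segments were translated by hand from the DNA sequence in Fig. 2, and the function name is hypothetical.

```python
# Sketch: classify an LRR as A-type (Cys/Thr at position 10) or B-type
# (Asn at position 10) from the start of the consensus. Segments are a
# hand translation of Fig. 2 (assumption).
import re

# Positions 1-12 of the consensus xLxxLxLxx[N,C,T]x[L,I]
LRR_START = re.compile(r".L..L.L..([NCT]).[LI]")


def repeat_type(segment: str):
    """Return 'A', 'B', or None if the segment does not fit the consensus."""
    match = LRR_START.match(segment)
    if match is None:
        return None
    return "B" if match.group(1) == "N" else "A"


SEGMENTS = {
    "LRR 2": "NLVSLHLQHCNI",  # cysteine at position 10
    "LRR 3": "QLIYLYLSHNDI",  # asparagine at position 10
    "LRR 9": "YLETLWLDNTNL",  # threonine at position 10
}

classification = {name: repeat_type(seq) for name, seq in SEGMENTS.items()}
print(classification)
```

Repeats that deviate from the consensus at a leucine position would return None here, so a fuller analysis would need a more tolerant match than this strict pattern.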

The cysteine pattern in the C-terminal region of
chondroadherin is different from those of the other
extracellular matrix LRR proteins, with four cysteine
residues being divided into two exons in such a way that the
disulfide bonds are made between cysteine residues in
different exons. The other genes have only two cysteine
residues in this region. These cysteines are encoded by
a single exon separate from the LRRs. The C-terminal
cysteine pattern found in chondroadherin is also found
in the platelet glycoproteins V (Lanza et al., 1993), IX
(Hickey et al., 1990), Ib-α (Lopez et al., 1987), and
Ib-β (Wicki et al., 1989). The genomic structures of some
of these genes are known (Yagi et al., 1995). The exon/
intron organization of these genes differs from the
chondroadherin gene in that the entire protein coding
sequence is within one exon. Thus, the chondroadherin
C-terminal disulfide bonded region is likely to have
been created by recombination events different from
those of the other known genes.



FIG. 5. Primer extension analysis of the mouse chondroadherin 
gene places the transcriptional start 50 bp upstream of the ATG. An 
end-labeled oligonucleotide corresponding to the reverse
complement from position +60 to +80, in the first exon of the mouse
sequence, was annealed to total RNA isolated from mouse trachea.
Primer extension was performed as described under Materials and 
Methods. Lanes T, G, C, and A correspond to a dideoxy sequencing 
reaction of genomic DNA primed with the same oligonucleotide. Lane 
1, blank; lane 2, a major band corresponding to the transcriptional 
start site is indicated by an arrow. 
MURINE  CTTCCTGT-TTGC-GCTCCAGATTT CTATAAACATAAA TGTATGTGTGTTCA         50
RAT     CATGCTGTATTTCTGCCCCGGATTT CTATAAACATAAA TGTCTGTGTGTAAAAAAAAAA  59
BOVINE  CACGCTGCATTTCTTCCCCAGATTT CTATAAATATAAA TTTATGTATGTATAATAA     56
        *   ***  ** *  * ** ***** ******* ***** * * *** ***  *

FIG. 6. Alignment of the 3' end nucleotide sequence of chondroadherin cDNA and genomic DNA from different species. Comparison of
the 3' end nucleotide sequence of murine chondroadherin with those of rat and bovine chondroadherin is shown. All three species have
similar sequences, and all contain the atypical polyadenylation signal (double underlined). Identical nucleotides are indicated by stars.
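
The identity marks in an alignment such as Fig. 6 can be recomputed column by column. The following sketch does so for the three 3'-end sequences. It assumes the gap placement shown for the murine sequence, assumes that the rat signal reads CTATAAACATAAA (as stated in the text, since the printed characters are damaged), and compares only the columns covered by all three rows. The names are hypothetical.

```python
# Sketch: mark alignment columns that are identical in all rows.
# Gap placement and the repaired rat signal are assumptions of this sketch.

ALIGNMENT = {
    "murine": "CTTCCTGT-TTGC-GCTCCAGATTTCTATAAACATAAATGTATGTGTGTTCA",
    "rat": "CATGCTGTATTTCTGCCCCGGATTTCTATAAACATAAATGTCTGTGTGTAAAAAAAAAA",
    "bovine": "CACGCTGCATTTCTTCCCCAGATTTCTATAAATATAAATTTATGTATGTATAATAA",
}


def identity_line(rows, gap: str = "-") -> str:
    """Return a string with '*' under columns where every row carries the
    same nucleotide, and ' ' elsewhere. Only the columns present in all
    rows are compared."""
    width = min(len(row) for row in rows)
    marks = []
    for i in range(width):
        column = {row[i] for row in rows}
        identical = len(column) == 1 and gap not in column
        marks.append("*" if identical else " ")
    return "".join(marks)


stars = identity_line(list(ALIGNMENT.values()))
for name, row in ALIGNMENT.items():
    print(f"{name:<7} {row}")
print(f"{'':<7} {stars}")
```

Within the 13-nucleotide block containing the putative signal (columns 26 to 38), only one column differs among the three species under these assumptions.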



The presence of the third intron in the 3'-UTR also
sets chondroadherin apart compared to the others. Ad- 
ditionally, all the other LRR protein genes mentioned 
above, including the platelet glycoproteins, have an in- 
tron close to the ATG translational start codon in the 5' 
end, which is not present in the chondroadherin gene. 

About 669 bp of a putative promoter region was
sequenced and examined to determine if any specific
structural features of mammalian promoters were
present. By sequence analysis we found a potential
TATAA-box sequence situated 29 bp upstream of the
putative transcription initiation site identified by primer
extension analysis. Such a TATAA box is often found
in genes encoding proteins that have a relatively restricted
expression, which agrees well with the rather tissue-specific
expression pattern of chondroadherin (Larsson et al.,
1991). Decorin has been shown to possess a TATAA-box
sequence, while this is absent in biglycan. It has
been suggested that TATAA-less promoters are common
for proteins with a more widespread expression
(Ishii et al., 1985).

The end of the fourth putative exon cannot be pinpointed
unambiguously, since no classical polyadenylation
signal was found from the stop codon to the end of
the second SacI fragment. The best candidate for a
polyadenylation signal is the repeat TATAAACATAAA
found at position 4047, underlined in Fig. 2. This
sequence is also found at the 3' end of the 1644-base-long,
full-length rat cDNA clone just upstream of a
poly(A) stretch (Shen et al., 1998) and in the 3' end of
the bovine cDNA clone (Fig. 6). It appears, then, that
the mouse, bovine, and rat genes all have atypical
polyadenylation signals. This putative polyadenylation
signal differs from the classical consensus signal by 1
nucleotide. Such a naturally mutated polyadenylation
signal may be weaker than normal. There is some
evidence that this type of site has stronger upstream
and downstream elements that are involved in defining
the cleavage site to compensate for the weaker signals
(for reference see Wahle, 1995). Such elements are not
fully characterized, but the downstream elements seem
to occur within 50 bases 3' of the cleavage site. One
form of such an element may have the consensus
sequence YGUGUUYY (Y = C or T) (Wahle, 1995). The
sequence TGTGTTCA is found 17 bp downstream of
the putative polyadenylation site in the mouse genomic
sequence. A similar sequence is also present in the rat
cDNA sequence.

The primary structure of mouse chondroadherin
shows a high degree of conservation when compared to
the same protein from other species. The identity of
the mouse chondroadherin amino acid sequence was
more than 90% when compared to rat and bovine
chondroadherin amino acid sequences. Furthermore, the
cysteine pattern and the leucine residues in the LRRs
were conserved in all three species. Also, the other
members of the LRR family show a high overall
homology between species, indicating important
well-conserved biological functions.

The chromosomal localizations of these genes do not
follow their similarity in protein structure. Thus
lumican and decorin are both found on mouse chromosome
10, while fibromodulin and PRELP are both found on
human chromosome 1. Biglycan, which is most similar
to decorin, is found on human chromosome X. SSCP
analysis was used to localize chondroadherin to mouse
chromosome 11, tightly linked to the Hoxb cluster. This
locus has been mapped to 17q21-q22 in humans. Since
subchromosomal linkage relationships are conserved
in many cases between mouse and human, this result
suggests that the human homolog of chondroadherin
will also be found in this region. In addition, given the
well-described paralogy between this region on mouse
chromosome 11 and the Hoxc region on mouse
chromosome 15 (corresponding to human 12q13), it is quite
possible that there is another chondroadherin-like gene
in mammals. Several murine mutations that, according
to the Mouse Genome Database (www.informatics.jax.org),
map close to the position of chondroadherin on
chromosome 11 and whose pathology could be due to
abnormalities of extracellular matrix development
include Bald arthritic (Bda), Bareskin (Bsk), Rex (Re),
and Cleft lip 1 (clf1).